24 research outputs found

    A programmable architecture for the provision of hybrid services

    The success of new service provision platforms will largely depend on their ability to blend with existing technologies. The advent of Internet telephony, although impressive, is unlikely to make telephone customers suddenly turn in favor of computers. Rather, customers display increasing interest in services that span multiple networks (especially Internet Protocol-based networks and the telephone and cellular networks) and open new vistas. We refer to these services as hybrid services and propose an architecture for their provision. This architecture allows for programming the service platform elements (i.e., network nodes, gateways, control servers, and terminals) in order to include new service logics. We identify components that can be assembled to build these logics by considering a service as a composition of features such as address translation, security, call control, connectivity, charging, and user interaction. Generic service components are derived from the modeling of these features. We ensure that our proposal can be implemented even in existing systems at the cost of slight changes: these systems are required to generate an event when a special service is encountered. This event is handled by an object in a Java Service Layer. Java has been chosen for its platform neutrality and its embedded security mechanisms. Using our architecture, we design a hybrid closed user group service.

    An Architecture for the Integration of Internet and Telecommunication Services

    In this paper, we propose an architecture for hybrid services, i.e., services that span many network technologies, especially the PSTN and the Internet. These services will play an important role in the future because they leverage the existing infrastructure rather than requiring brand-new and sophisticated mechanisms to be deployed. We explore a few issues related to hybrid services and propose a platform, as well as a set of components, to facilitate their creation and deployment. The existing infrastructure is only required to generate specific events when requests for hybrid services are detected. We present the design of a service layer, based on Java, that handles these special requests. Our service layer is provided with a set of generic components realized as Java Beans. Hence, we can provide hybrid services without changing the existing infrastructure. We illustrate this strength of our architecture by discussing the call forwarding service.

    Optimizing simultaneous autoscaling for serverless cloud computing

    This paper explores resource allocation in serverless cloud computing platforms and proposes an optimization approach for autoscaling systems. Serverless computing relieves users of resource management tasks, letting them focus on application functions. However, dynamic resource allocation and function replication based on changing loads remain crucial. Typically, autoscalers in these platforms use threshold-based mechanisms to adjust function replicas independently. We model applications as interconnected graphs of functions, where requests probabilistically traverse the graph, triggering associated function execution. Our objective is to develop a control policy that optimally allocates resources on servers, minimizing failed requests and response time in reaction to load changes. Using a fluid approximation model and Separated Continuous Linear Programming (SCLP), we derive an optimal control policy that determines the number of resources per replica and the required number of replicas over time. We evaluate our approach using a simulation framework built with Python and SimPy. Compared with threshold-based autoscaling, our approach demonstrates significant improvements in average response times and failed requests, ranging from 15% to over 300% in most cases. We also explore the impact of system and workload parameters on performance, providing insights into the behavior of our optimization approach under different conditions. Overall, our study contributes to advancing resource allocation strategies, enhancing efficiency and reliability in serverless cloud computing platforms.
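The threshold-based baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple utilization rule (arrival rate divided by total replica capacity) and hypothetical names (`FunctionState`, `scale_step`) and thresholds. It is not the paper's simulator or its SCLP-based policy, only an example of scaling each function independently from its own load.

```python
from dataclasses import dataclass


@dataclass
class FunctionState:
    """State of one serverless function (hypothetical model)."""
    replicas: int
    service_rate: float  # requests/s that a single replica can serve (assumed)


def utilization(fn: FunctionState, arrival_rate: float) -> float:
    """Fraction of the function's total capacity consumed by the current load."""
    return arrival_rate / (fn.replicas * fn.service_rate)


def scale_step(fn: FunctionState, arrival_rate: float,
               upper: float = 0.8, lower: float = 0.3,
               min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Return the next replica count using fixed utilization thresholds.

    The decision looks only at this function's own load, which is the
    "independent" adjustment the abstract attributes to typical autoscalers.
    """
    u = utilization(fn, arrival_rate)
    if u > upper:
        return min(fn.replicas + 1, max_replicas)  # scale out by one replica
    if u < lower:
        return max(fn.replicas - 1, min_replicas)  # scale in by one replica
    return fn.replicas                             # within band: no change
```

Because each function reacts only to its own utilization, a load surge propagates through the function graph one scaling step at a time, which is the kind of behavior a jointly optimized policy aims to avoid.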

    FfDL: A Flexible Multi-tenant Deep Learning Platform

    Deep learning (DL) is becoming increasingly popular in several application domains and has made several new application features involving computer vision, speech recognition and synthesis, self-driving automobiles, drug design, etc. feasible and accurate. As a result, large-scale on-premise and cloud-hosted deep learning platforms have become essential infrastructure in many organizations. These systems accept, schedule, manage, and execute DL training jobs at scale. This paper describes the design, implementation, and our experiences with FfDL, a DL platform used at IBM. We describe how our design balances dependability with scalability, elasticity, flexibility, and efficiency. We examine FfDL qualitatively through a retrospective look at the lessons learned from building, operating, and supporting FfDL; and quantitatively through a detailed empirical evaluation of FfDL, including the overheads introduced by the platform for various deep learning models, the load and performance observed in a real case study using FfDL within our organization, the frequency of various faults observed including unanticipated faults, and experiments demonstrating the benefits of various scheduling policies. FfDL has been open-sourced. Comment: MIDDLEWARE 201

    and

    Since many Internet applications employ a multitier architecture, in this article, we focus on the problem of analytically modeling the behavior of such applications. We present a model based on a network of queues where the queues represent different tiers of the application. Our model is sufficiently general to capture (i) the behavior of tiers with significantly different performance characteristics and (ii) application idiosyncrasies such as session-based workloads, tier replication, load imbalances across replicas, and caching at intermediate tiers. We validate our model using real multitier applications running on a Linux server cluster. Our experiments indicate that our model faithfully captures the performance of these applications for a number of workloads and configurations. Furthermore, our model successfully handles a comprehensive range of resource utilization (from 0 to near saturation for the CPU) for two separate tiers. For a variety of scenarios, including those with caching at one of the application tiers, the average response times predicted by our model were within the 95% confidence intervals of the observed average response times. Our experiments also demonstrate the utility of the model for dynamic capacity provisioning, performance prediction, bottleneck identification, and session policing. In one scenario, where the request arrival rate increased from less than 1500 to nearly 4200 requests/minute, a dynamic provisionin

    Enabling Efficient Placement of Virtual Infrastructures in the Cloud

    Part 5: Big-Data and Cloud Computing. In the IaaS model, users have the opportunity to run their applications by creating virtualized infrastructures from virtual machines, networks, and storage volumes. However, they are still not able to optimize these infrastructures for their workloads in order to receive guarantees of resource requirements or availability constraints. In this paper we address the problem of efficiently placing such infrastructures in large-scale data centers, while considering compute and network demands, as well as availability requirements. Unlike previous techniques that focus on the networking or the compute resource allocation in a piecemeal fashion, we consider all these factors in one single solution. Our approach makes the problem tractable, while enabling the load balancing of resources. We show the effectiveness and efficiency of our approach with a rich set of workloads over extensive simulations.

    Dual Bus MAN’S with Multiple-Priority Traffic

    The IEEE 802.6 standard for metropolitan area networks does not provide multiple-priority traffic for connectionless data services. A priority mechanism that was considered in earlier versions of the standard proved ineffective. As of now, there exists no protocol for multiple-access dual bus networks that is able to implement preemptive priorities and, at the same time, can satisfy minimal fairness requirements for transmissions at the highest priority level. In this study, a protocol with strictly preemptive priorities, i.e., a protocol that does not admit low-priority traffic if the load from high-priority traffic exceeds the capacity of the transmission channel, i

    Performance Management for Cluster Based Web Services

    We present an architecture and prototype implementation of a performance management system for cluster-based web services. The system supports multiple classes of web services traffic and allocates server resources dynamically so as to maximize the expected value of a given cluster utility function in the face of fluctuating loads. The cluster utility is a function of the performance delivered to the various classes, and this leads to differentiated service. In this paper we use the average response time as the performance metric. The management system is transparent: it requires no changes in the client code, the server code, or the network interface between them. The system performs three performance management tasks: resource allocation, load balancing, and server overload protection. We use two nested levels of management mechanisms. The inner level centers on queuing and scheduling of request messages. The outer level is a feedback control loop that periodically adjusts the scheduling weights and server allocations of the inner level. The feedback controller is based on an approximate first-principles model of the system, with parameters derived from continuous monitoring. We focus on SOAP-based web services. We report experimental results that show the dynamic behavior of the system.
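The outer feedback loop described in this abstract can be pictured with a small sketch. It substitutes a plain proportional rule for the paper's model-based, utility-maximizing controller, and the function name `adjust_weights`, the `gain` parameter, and the per-class response-time targets are all assumptions made for illustration.

```python
def adjust_weights(weights: dict[str, float],
                   measured_rt: dict[str, float],
                   target_rt: dict[str, float],
                   gain: float = 0.5) -> dict[str, float]:
    """One outer-loop step: shift scheduling weight toward slow classes.

    weights      current scheduling weights per traffic class (sum to 1)
    measured_rt  average response time observed for each class
    target_rt    desired average response time for each class (assumed input)
    gain         proportional gain; larger values react more aggressively
    """
    raw = {}
    for cls, w in weights.items():
        # Relative error: positive when the class is slower than its target.
        error = (measured_rt[cls] - target_rt[cls]) / target_rt[cls]
        # Keep every weight strictly positive so no class is starved.
        raw[cls] = max(w * (1.0 + gain * error), 1e-6)
    total = sum(raw.values())
    # Renormalize so the inner-level scheduler always sees weights summing to 1.
    return {cls: v / total for cls, v in raw.items()}
```

In a running system this function would be called once per control period with fresh measurements, while the inner level keeps scheduling requests using the most recent weights.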